1 INTRODUCTION

Among the most luminous bodies in the universe are the
brightest, or first-ranked, galaxies in rich clusters. These
galaxies have absolute magnitudes between -21.5 and -23.3
and are among the farthest observable objects. In addition,
the magnitudes of these brightest cluster galaxies (BCGs)
are highly uniform, with a dispersion of 0.32 magnitudes
(Hoessel & Schneider 1985). Their uniformity and large
luminosity make BCGs excellent standard candles. The
uniformity of BCG magnitudes raises a particularly important
question regarding their nature (Peebles 1968; Sandage
1972). Are BCGs simply the brightest of a statistical set of
galaxies or do they belong to a special class of objects? If
a special class of galaxies exists, do all clusters have special
galaxies and are they always first-ranked (Bhavsar 1989)?
We investigate these questions using extreme value theory 
(Fisher & Tippett 1928). 



ABSTRACT 

The brightest, or first-ranked, galaxies (BCGs) in rich clusters show a very small dis- 
persion in luminosity, making them excellent standard candles. This small dispersion 
raises questions about the nature of BCGs. Are they simply the extremes of normal 
galaxies formed via a stochastic process, or do they belong to a special class of atypical 
objects? If they do, are all BCGs special, or do normal galaxies compete for the first 
rank? To answer these questions, we undertake a statistical study of BCG magnitudes 
using results from extreme value theory. Two-population models do better than do 
one-population models. A simple model where a random boost in the magnitude of a 
fraction of bright normal galaxies forms a class of atypical galaxies best describes the 
observed distribution of BCG magnitudes. 

Key words: methods: statistical, galaxies: clusters: general, galaxies: cD, galaxies: 
evolution 



sire to understand these types of phenomena prompts the 
study of extreme value theory. 

Fisher & Tippett (1928) show that the distribution of 
statistically largest or smallest extremes tends asymptoti- 
cally to a well-determined and analytic form for a general 
class of parent distributions. Extremes drawn from suffi- 
ciently large and steeply falling parent distributions have 
this form. One may find the original argument in Fisher & 
Tippett (1928). Their derivation is reconstructed in greater 
detail by Bhavsar & Barrow (1985), who apply extreme 
value theory in an analysis of BCG magnitudes. Fisher & 
Tippett's result states that the cumulative distribution of
maximum extremes is given by:

$$F(x) = e^{-e^{-a(x - x_0)}} \qquad (1)$$



This distribution is known as the Gumbel distribution. (For
smallest extremes, one substitutes $x \to -x$.) From $F$ we
may calculate the differential distribution (or probability
density):



2 EXTREME VALUE THEORY 

The motivation for studying extreme phenomena is prac- 
tical. Many of the memorable experiences in our lives can 
be classified as statistical extremes. Examples of maximum 
extremes are floods, the hottest summer temperatures and 
the lengths of the longest caterpillars. Examples of minimum 
extremes are droughts, stock market crashes and the
wingspans of the smallest hummingbirds. Some extremes do not
affect our lives and others turn them upside down. The de-



$$f(x) = a\,e^{-a(x - x_0) - e^{-a(x - x_0)}} \qquad (2)$$



where $f(x) = F'(x)$; $x_0$ is the mode of the extremes and
$a > 0$ is a measure of the steepness of fall of the parent
distribution. The probability density is normalized to unity.
The mean, median and standard deviation of the distribution
given in Bhavsar & Barrow (1985) correspond to:



$$\langle x \rangle = x_0 + \frac{0.577}{a}, \qquad \mathrm{med}(x) = x_0 + \frac{0.367}{a}, \qquad \sigma = \frac{\pi}{\sqrt{6}\,a}, \qquad (3)$$
Figure 1. The Gumbel distribution for maximum extremes.



where $0.577 \approx -\Gamma'(1)$ is Euler's constant, $0.367 \approx -\ln(\ln 2)$
and $\sigma$ is the standard deviation of the extremes. The
standard form for the Gumbel, $F(x)$ and $f(x)$, is shown in Fig.
1. Note that for BCGs, we will be considering minimum
extremes (because more negative magnitudes are brighter) and
the curves will be inverted ($x \to -x$). Henceforth, we will
call $f(x)$ the Gumbel distribution.
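
The constants in Equation (3) can be checked numerically. The sketch below is illustrative only: it integrates the standard form of Equation (2) ($x_0 = 0$, $a = 1$) with a simple midpoint rule, and the function names, integration limits and step count are our own choices rather than anything taken from the cited works.

```python
import math

def gumbel_cdf(x, x0=0.0, a=1.0):
    """Equation (1): cumulative distribution of maximum extremes."""
    return math.exp(-math.exp(-a * (x - x0)))

def gumbel_pdf(x, x0=0.0, a=1.0):
    """Equation (2): probability density, the derivative of Equation (1)."""
    z = a * (x - x0)
    return a * math.exp(-z - math.exp(-z))

def standard_moments(lo=-8.0, hi=40.0, steps=100_000):
    """Midpoint-rule norm, mean and standard deviation for x0 = 0, a = 1.

    The limits are assumptions: the density is negligible outside them.
    """
    h = (hi - lo) / steps
    norm = mean = second = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h
        w = gumbel_pdf(z) * h
        norm += w
        mean += w * z
        second += w * z * z
    return norm, mean, math.sqrt(second - mean * mean)

norm, mean, sigma = standard_moments()
median = -math.log(math.log(2.0))  # solves F(x) = 1/2 for x0 = 0, a = 1
print(norm, mean, sigma, median)
```

For general $x_0$ and $a$, the mean and median shift by $x_0$ and all three offsets scale as $1/a$, which is the content of Equation (3).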



3 BRIGHTEST CLUSTER GALAXIES 
3.1 Past Results 

Researchers have described BCGs as a special class of objects,
as statistical extremes of a normal population, and as a
mixture of the two
(Peebles 1968; Peach 1969; Sandage 1972, 1976; Bhavsar &
Barrow 1985; Bhavsar 1989; Postman & Lauer 1995).

The proposal that BCGs are special is motivated by the
small dispersion observed in BCG magnitudes
(Peach 1969; Sandage 1972, 1976). These authors argue
that such a small dispersion is not sufficiently explained
by the steepness of the luminosity function. In addition,
astronomers observe a class of BCGs that are morphologically
different, called cD galaxies. These galaxies are giant
ellipticals and often have features, such as multiple nuclei and
large envelopes, that distinguish them from normal galaxies.

On the other hand, Peebles (1968) argues that BCGs 
are just the extreme tail-end of normal galaxies that form in 
clusters via some stochastic process. In this case, the bright- 
est galaxy in a given cluster is simply the brightest normal 
galaxy and, therefore, the distribution of BCG magnitudes 
is a Gumbel. (It is interesting to note that Peebles, indepen- 
dently of Fisher & Tippett, derived the Gumbel distribution 
for BCGs for the special case of an exponential luminosity 
function.) 
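
Peebles' special case lends itself to a quick numerical illustration. The sketch below is a toy model, not a reconstruction of his derivation: it works in a generic variable $x$ for which larger means brighter (so $x$ plays the role of $-M$), draws each cluster's members from an exponential parent, and compares the sample mean and scatter of the brightest members with the Gumbel values of Equation (3). The cluster size, number of clusters and seed are arbitrary assumptions.

```python
import math
import random

def brightest_of(n_galaxies, rng, a=1.0):
    """Largest of n draws from an exponential parent with steepness a."""
    return max(rng.expovariate(a) for _ in range(n_galaxies))

def simulate_extremes(n_clusters, n_galaxies, a=1.0, seed=0):
    """One brightest member per simulated cluster."""
    rng = random.Random(seed)
    return [brightest_of(n_galaxies, rng, a) for _ in range(n_clusters)]

def mean_and_std(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def gumbel_prediction(n_galaxies, a=1.0):
    """Mode, mean and standard deviation expected for the extremes.

    For an exponential parent, (1 - e^{-ax})^n ~ exp(-e^{-a(x - x0)})
    with x0 = ln(n) / a, so Equation (3) applies with that mode.
    """
    x0 = math.log(n_galaxies) / a
    return x0, x0 + 0.577 / a, math.pi / (math.sqrt(6.0) * a)

extremes = simulate_extremes(n_clusters=3000, n_galaxies=200, seed=1)
print(mean_and_std(extremes), gumbel_prediction(200)[1:])
```

Note that the scatter of the extremes depends only on the steepness $a$ and not on the number of galaxies per cluster, which is why a steep luminosity function yields a small dispersion.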

Bhavsar (1989) contends that neither of these scenarios
adequately describes the observed distribution of BCG
magnitudes and argues for a mixed population. Suppose that
a special class of galaxies exists but that not all clusters
have a special galaxy. In clusters with no special galaxy,
the BCG is simply the brightest normal galaxy. In a cluster
containing at least one special galaxy, either all the
normal galaxies are fainter, or the brightest normal galaxy(ies)
outshines the special one(s) and attains the first rank. For
these reasons, one might expect both types of galaxies to
comprise the BCG population. In what follows, we investigate
these assumptions quantitatively by analyzing Lauer &
Postman's (1994) data set and revisiting the one used by
Bhavsar (1989).



3.2 The Distribution of BCG magnitudes 

In the case of one population, the distribution function is
straightforward. If BCGs are all drawn from a special class
of objects, it has been assumed that BCG magnitudes are
normally distributed (Peach 1969; Sandage 1972, 1976;
Postman & Lauer 1995). In this case, referred to henceforth as
model A, the probability distribution of special galaxies, $f_{sp}$,
is a Gaussian, $f_g$, with mean $M_g$, standard deviation $\sigma$ and
normalization such that the integral over all magnitudes, $M$,
is unity. The distribution function is as follows:

$$f_{sp}(M) = f_g = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(M - M_g)^2 / 2\sigma^2} \qquad (4)$$



If BCGs are simply the brightest of a normal set of galaxies
(Peebles 1968), henceforth referred to as model B, the
probability distribution of their magnitudes, $f_{nor}$, is a Gumbel,
$f_G$, given by Equation (2) with $x \to -M$ and $x_0 \to -M^*$,
where $M^* = M_G + 0.577/a$ (Bhavsar & Barrow 1985):

$$f_{nor}(M) = f_G = a\,e^{a(M - M^*) - e^{a(M - M^*)}}, \qquad (5)$$

where $M_G$ is the mean of the extremes and $a$ is a measure
of the steepness of fall of the parent distribution.

In the case of two populations (Bhavsar 1989), we derive
the distribution that $M$ should have from the contributions
of the two individual populations. Consider $N$ clusters of
galaxies and suppose that $n < N$ have at least one special
galaxy. Let the independent magnitude-distributions of
normal and special galaxies, respectively, be $f_{nor}$ and $f_{sp}$. The
total magnitude-distribution function, $f_{tot}$, is then given by:

$$f_{tot}(M) = d\left[f_{sp}(M)\int_M^\infty f_{nor}(M')\,dM' + f_{nor}(M)\int_M^\infty f_{sp}(M')\,dM'\right] + (1 - d)\,f_{nor}(M), \qquad (6)$$



where $d = n/N$. The first (second) term is the probability of
picking a special (normal) galaxy, with absolute magnitude
$M$, from a cluster containing both populations with the
condition that all the normal (special) galaxies are fainter. The
third term gives the probability of picking a galaxy, with
absolute magnitude $M$, in clusters containing only normal
galaxies. Equation (6) is true for all well-behaved functions
$f_{nor}$ and $f_{sp}$. If $f_{nor}$ and $f_{sp}$ are normalized to unity, then so
is the resulting total distribution function $f_{tot}$. (Note that
Equation (6) works, in general, whenever there are two
independent populations competing for first rank.)
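
The normalization statement follows in a few lines. The derivation below is our own worked check, not taken from the cited papers; the shorthand $I_{nor}$ and $I_{sp}$ for the two tail integrals is introduced here for convenience.

```latex
\begin{align}
I_{nor}(M) &\equiv \int_M^\infty f_{nor}(M')\,dM', \qquad
I_{sp}(M) \equiv \int_M^\infty f_{sp}(M')\,dM', \\
\frac{d}{dM}\left[I_{nor}\,I_{sp}\right]
  &= -\left[f_{nor}\,I_{sp} + f_{sp}\,I_{nor}\right], \\
\int_{-\infty}^{\infty}\left[f_{sp}\,I_{nor} + f_{nor}\,I_{sp}\right]dM
  &= I_{nor}(-\infty)\,I_{sp}(-\infty) - I_{nor}(\infty)\,I_{sp}(\infty)
   = 1 - 0 = 1, \\
\int_{-\infty}^{\infty} f_{tot}\,dM &= d \cdot 1 + (1 - d) \cdot 1 = 1 .
\end{align}
```

The bracketed term is the density of the brighter (smaller-magnitude) of two independent draws, one from each population, so it is itself a normalized distribution; mixing it with $f_{nor}$ using weights $d$ and $1 - d$ preserves the normalization.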

For BCGs, we consider three different two-population
models. The first is the case discussed above with the
brightest normal galaxies comprising one population and a
special class of galaxies comprising the other. We call this case
model C and write the total distribution as $f_{Gg}$ (where 'Gg'
stands for 'Gumbel + gaussian'). To obtain the final form
of $f_{Gg}$ we note that:

$$I_G = \int_M^\infty f_G(M')\,dM' = F(M), \qquad (7)$$
Table 1. Distribution components for the five models.

MODEL   f_nor   f_sp
A       -       f_g
B       f_G     -
C       f_G     f_g
D       f_g1    f_g2
E       f_G1    f_G2

where $F(M)$ is given by Equation (1) with $x \to -M$ and
$x_0 \to -(M_G + 0.577/a)$. Second, we note that:

$$I_g = \int_M^\infty f_g(M')\,dM' = \frac{1}{2}\left[1 \pm \mathrm{erf}\left(\frac{|M - M_g|}{\sqrt{2}\,\sigma}\right)\right], \qquad (8)$$

where erf is the error function. The upper sign is for $M < M_g$
and the lower sign is for $M > M_g$. Thus, we may rewrite
$f_{Gg}$ by substituting in $I_G$ and $I_g$:

$$f_{Gg}(M) = d\,[f_g\,I_G + f_G\,I_g] + (1 - d)\,f_G. \qquad (9)$$

Other possible combinations of assigning $f_G$ and $f_g$ to the
two populations result in models D and E. In the case of
model D, both distributions are Gaussian and the total
distribution function, $f_{gg}$, is given by:

$$f_{gg}(M) = d\,[f_{g2}\,I_{g1} + f_{g1}\,I_{g2}] + (1 - d)\,f_{g1}, \qquad (10)$$

where the notation is self-evident and the two Gaussians are
characterized, respectively, by $M_{g1}$, $\sigma_1$ and $M_{g2}$, $\sigma_2$. In the
case of model E, both distributions are Gumbels ($f_{sp}$ is also
a Gumbel) and the total distribution function, $f_{GG}$, is given
by:

$$f_{GG}(M) = d\,[f_{G2}\,I_{G1} + f_{G1}\,I_{G2}] + (1 - d)\,f_{G1}, \qquad (11)$$

where the two Gumbels are characterized, respectively, by
$M_{G1}$, $a_1$ and $M_{G2}$, $a_2$. Table 1 summarizes the forms of the
five models.
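
As an illustration of how Equations (5) and (7)-(9) fit together, the sketch below evaluates the model C density and checks its normalization by direct integration. It is a minimal sketch under stated assumptions: the function names, the integration limits and the step count are ours, and the parameter values are the Bhavsar (1989) best-fitting values quoted in Section 5.2, used here only as an example. The Gaussian tail is written as a single expression, which is equivalent to the two sign cases of Equation (8) because erf is odd.

```python
import math

def gumbel_min_pdf(m, m_G, a):
    """f_G of Equation (5); m_G is the mean, so the mode is M* = M_G + 0.577/a."""
    z = min(a * (m - (m_G + 0.577 / a)), 700.0)  # cap avoids overflow in exp
    return a * math.exp(z - math.exp(z))

def gumbel_min_tail(m, m_G, a):
    """I_G of Equation (7): probability that a normal galaxy is fainter than m."""
    z = min(a * (m - (m_G + 0.577 / a)), 700.0)
    return math.exp(-math.exp(z))

def gauss_pdf(m, m_g, sigma):
    """f_g of Equation (4)."""
    return math.exp(-0.5 * ((m - m_g) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def gauss_tail(m, m_g, sigma):
    """I_g of Equation (8), written without the sign cases."""
    return 0.5 * (1.0 - math.erf((m - m_g) / (math.sqrt(2.0) * sigma)))

def f_gumbel_gauss(m, m_G, a, m_g, sigma, d):
    """Equation (9): Gumbel (normal) plus Gaussian (special) populations."""
    f_G, I_G = gumbel_min_pdf(m, m_G, a), gumbel_min_tail(m, m_G, a)
    f_g, I_g = gauss_pdf(m, m_g, sigma), gauss_tail(m, m_g, sigma)
    return d * (f_g * I_G + f_G * I_g) + (1.0 - d) * f_G

def integrate(func, lo, hi, steps=20_000):
    """Midpoint rule; the limits are chosen by hand to enclose the density."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

# Example parameters: the Bhavsar (1989) values quoted in Section 5.2.
params = dict(m_G=-22.31, a=4.01, m_g=-22.79, sigma=0.21, d=0.63)
print(integrate(lambda m: f_gumbel_gauss(m, **params), -27.0, -19.0))
```

Models D and E follow by swapping in the appropriate pair of density and tail functions, as listed in Table 1.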



4 MODELING THE DATA 
4.1 Data Sets 

We utilize two data sets from the literature. First, we
reanalyze the data used by Bhavsar (1989). This is a 93-member
subset of 116 metric BCG visual-intrinsic (VI) magnitudes
compiled by Hoessel, Gunn & Thuan (1980), henceforth
referred to as "HGT". These 93 are the data from clusters
of richness 0 and 1 only; Bhavsar ignores the rest of the
BCGs in order to keep the data set homogeneous. The BCG
magnitudes are internally consistent to 0.04 magnitudes, as
published in HGT. Second, we analyze the 119 metric BCG
magnitudes, taken in the Kron-Cousins $R_c$ band, compiled
by Lauer & Postman (1994), henceforth referred to as "LP".
The data were corrected for local and possible large-scale
galactic motions. The 119 LP data comprise BCGs
from 107 clusters of richness 0 & 1, and 9, 2 and 1 of richness
2, 3 and 4, respectively (Abell, Corwin & Olowin 1989). We
find that removing the 12 BCGs from clusters of richness
class $\geq 2$ does not significantly change the distribution of
the LP data. This is consistent with Sandage's (1976) result
that BCG magnitude is independent of cluster-richness. The
internal consistency of the set is 0.014 magnitudes, as
published in Postman & Lauer (1995). Bhavsar (1989) proposes
a two-population model for the HGT data. His
maximum-likelihood fit is consistent with the data and has
parameter-values consistent with physically measured quantities.
Postman & Lauer (1995) conclude that the LP data are
consistent with a Gaussian.

There are differences in the data sets that could be the 
reason for the disagreement between Bhavsar (1989) and 
Postman & Lauer (1995). The two were obtained in different 
optical bands. The mean of the HGT data set is 0.2 magni- 
tudes brighter than the mean of the LP data set. The two 
data sets have 34 galaxies in common. Comparing the subset 
of 34, we find that the HGT values are, on average, 0.06 ± 
0.19 magnitudes brighter than the LP values. A two-sample
Kolmogorov-Smirnov (K-S) Test addresses the consistency
of the two data sets in describing the same population of
objects. The null hypothesis is that the same distribution
describes both data sets. We find that the null hypothesis
is rejected at the 82% confidence level. Therefore,
we do not expect the same parameters or distribution to
describe both sets. These discrepancies may need further
investigation, but such an analysis is outside the scope of this
work. We investigate each data set separately and present
our results.
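
For readers unfamiliar with the two-sample K-S Test, a minimal standard-library sketch follows. It is not the code used for the analysis above. The statistic $D$ is the largest gap between the two empirical cumulative distributions; the significance uses the asymptotic Kolmogorov series with the small-sample correction factor that, to our recollection, is the one given in Press et al. (1992). The function names and the short-circuit for small arguments are our own choices.

```python
import bisect
import math

def ks_two_sample_d(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        frac_a = bisect.bisect_right(a, x) / len(a)
        frac_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(frac_a - frac_b))
    return d

def kolmogorov_q(lam, terms=100):
    """Q_KS(lam) = 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2)."""
    if lam < 0.2:
        return 1.0  # the series converges slowly here; Q is 1 to ~1e-12
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))

def ks_two_sample(sample_a, sample_b):
    """Return (D, Q): Q is the probability of a gap this large under the
    null hypothesis, so 1 - Q is the confidence of rejection."""
    d = ks_two_sample_d(sample_a, sample_b)
    n_eff = len(sample_a) * len(sample_b) / (len(sample_a) + len(sample_b))
    root = math.sqrt(n_eff)
    return d, kolmogorov_q((root + 0.12 + 0.11 / root) * d)
```

In this convention, a rejection confidence of 82% corresponds to $Q = 0.18$.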



4.2 Fitting Method 

We consider models A-E discussed above. The two-population
distributions have five parameters each: two
means, two standard deviations and the fraction, $d$, of
clusters that contain a special population of galaxies. If there
is no population of special galaxies, then $d = 0$. We use
maximum-likelihood fitting. The theory behind this method
is discussed in Press et al. (1992). The maximum-likelihood
fit of a function, $f$, to a data set of size $N$ is the set of
parameters, $\mathbf{a}$, that maximizes the likelihood function:

$$L = \prod_{i=1}^{N} f(x_i; \mathbf{a}), \qquad (12)$$

where the $f(x_i; \mathbf{a})$ are the values of the probability density,
$f$, evaluated at each of the $N$ data points, $x_i$. For a given $f$,
one finds the set of parameters that maximizes the product,
$L$.
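
To make Equation (12) concrete, the sketch below fits the single Gumbel of Equation (5) to synthetic magnitudes by maximizing $\ln L$. Everything here is an illustrative assumption rather than the procedure actually used: the data are simulated, the mode $M^*$ is eliminated analytically for each trial $a$ (setting $\partial \ln L / \partial M^* = 0$ gives $e^{aM^*} = N^{-1}\sum_i e^{aM_i}$), and the remaining one-dimensional maximum is found with a golden-section search over a hand-picked bracket.

```python
import math
import random

def gumbel_min_logpdf(m, m_star, a):
    """ln f_G for Equation (5), written in terms of the mode M*."""
    z = a * (m - m_star)
    return math.log(a) + z - math.exp(z)

def log_likelihood(data, m_star, a):
    """ln L of Equation (12); the product becomes a sum of logarithms."""
    return sum(gumbel_min_logpdf(m, m_star, a) for m in data)

def best_mode(data, a):
    """M* that maximizes ln L at fixed a."""
    shift = max(data)  # factor out the largest term to avoid underflow
    mean_exp = sum(math.exp(a * (m - shift)) for m in data) / len(data)
    return shift + math.log(mean_exp) / a

def fit_gumbel_min(data, a_lo=0.1, a_hi=50.0, iterations=80):
    """Golden-section search on a, with M* profiled out at each step."""
    def profile(a):
        return log_likelihood(data, best_mode(data, a), a)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = a_lo, a_hi
    for _ in range(iterations):
        c = hi - ratio * (hi - lo)
        e = lo + ratio * (hi - lo)
        if profile(c) > profile(e):
            hi = e  # maximum lies in [lo, e]
        else:
            lo = c  # maximum lies in [c, hi]
    a = 0.5 * (lo + hi)
    return best_mode(data, a), a

def sample_gumbel_min(n, m_star, a, rng):
    """Synthetic magnitudes: M = M* + ln(E)/a with E exponential, mean 1."""
    return [m_star + math.log(rng.expovariate(1.0)) / a for _ in range(n)]

rng = random.Random(1)
fake = sample_gumbel_min(2000, m_star=-22.5, a=4.0, rng=rng)
print(fit_gumbel_min(fake))
```

The two-population models have five parameters and no comparable shortcut, so in practice a general multidimensional optimizer would be applied to $-\ln L$.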



5 RESULTS 

5.1 Parameters and Fits 

After obtaining parameters from the maximum-likelihood 
method for models A-E for both data sets, we compute the 
K-S statistics. We list the results in Tables 2 & 3, respec- 
tively. Lower values of the K-S D-statistic correspond to 
lower values of rejection probability, P, and thus denote a 
better fit. Figs. 2 & 3 illustrate the performance of each of 
the five models. Note that the distributions use the param- 
eters obtained by the maximum-likelihood method, using 
every data point, and are not a fit to the particular his- 
tograms. 
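
The D-statistic is the largest vertical gap between the empirical cumulative distribution of the data and the cumulative distribution of the fitted model. A minimal sketch of that computation is given below, with a Gaussian cumulative distribution as the example model; the function names are ours and this is not the code used to produce the tables.

```python
import math

def gaussian_cdf(m, mean, sigma):
    """Cumulative distribution of a Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((m - mean) / (math.sqrt(2.0) * sigma)))

def ks_statistic(data, cdf):
    """One-sample K-S D: the empirical distribution steps from i/n to
    (i + 1)/n at the i-th sorted point, so both sides are compared."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        model = cdf(x)
        d = max(d, abs(model - i / n), abs((i + 1) / n - model))
    return d
```

For example, `ks_statistic(magnitudes, lambda m: gaussian_cdf(m, -22.63, 0.34))` would evaluate model A with the HGT parameters of Table 2, where `magnitudes` stands for a list of BCG magnitudes that is not reproduced here.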
Table 2. Fit-parameters for the HGT data for models A-E.

MODEL A       MODEL B       MODEL C       MODEL D        MODEL E
M_g=-22.63    M_G=-22.66    M_G=-22.30    M_g1=-22.29    M_G1=-22.40
                            M_g=-22.79    M_g2=-22.83    M_G2=-22.86
                            d=0.64        d=0.62         d=0.48
σ=0.34        a=2.82        a=4.11        σ_1=0.24       a_1=3.70
                            σ=0.20        σ_2=0.19       a_2=8.83
D=0.0876      D=0.1174      D=0.0562      D=0.0519       D=0.0525
P=0.531       P=0.848       P=0.063       P=0.032        P=0.036



Table 3. Fit-parameters for the LP data for models A-E.

MODEL A       MODEL B       MODEL C       MODEL D        MODEL E
M_g=-22.43    M_G=-22.45    M_G=-21.84    M_g1=-22.11    M_G1=-22.18
                            M_g=-22.44    M_g2=-22.52    M_G2=-22.52
                            d=0.95        d=0.72         d=0.64
σ=0.33        a=2.99        a=5.11        σ_1=0.26       a_1=3.65
                            σ=0.32        σ_2=0.30       a_2=5.33
D=0.0565      D=0.1173      D=0.0570      D=0.0527       D=0.0421
P=0.162       P=0.926       P=0.158       P=0.098        P=0.014




Figure 2. Left-hand column shows cumulative distribution
function for the HGT data and the maximum-likelihood fits for each of
the five models. Right-hand column shows HGT histogram with
a plot of the differential distribution for each of the five models.
Horizontal axes: M(VI).




Figure 3. Left-hand column shows cumulative distribution
function for the LP data and the maximum-likelihood fits for each of
the five models. Right-hand column shows LP histogram with a
plot of the differential distribution for each of the five models.
Horizontal axes: M(Rc).



5.2 Comparison with Previous Work 

We compare our results with Bhavsar (1989) and Postman
& Lauer (1995). Bhavsar's (1989) two-population model is
our model C. He uses maximum-likelihood fitting and his
best-fitting parameters are $M_G = -22.31$, $M_g = -22.79$, $d =
0.63$, $a = 4.01$ and $\sigma = 0.21$. Our parameters are in excellent
agreement. Minor variation is expected due to differences in
fitting techniques. Postman & Lauer (1995) argue against
Bhavsar's two-population model and claim that BCG
magnitudes are Gaussian, based on a 26% confidence level.

In agreement with both Bhavsar (1989) and Postman
& Lauer (1995), it is clear from Tables 2 & 3 and Figs. 2 &
3 that for both data sets no Gumbel distribution describes
the BCG data. The K-S test rejects the Gumbel hypothesis (model
B) at the 85% and 93% confidence levels, respectively, for the
HGT and LP sets. For the HGT data, the Gaussian is rejected at
the 53% confidence level, while for the LP data, the rejection
confidence is 16%. The difference between our value of 16%
and Postman & Lauer's value arises because our result is
for the maximum-likelihood fit Gaussian, while Postman &
Lauer's is for a Gaussian with the same mean and standard
deviation as the LP data.

The relatively high rejection-confidence of the
one-population models has motivated us to investigate
two-population models. The presence of cD galaxies strongly
suggests the possibility of another population. Overall, the
two-population models fit the data much better than do
the Gumbels and as well as or better than do the respective
Gaussians. The larger number of parameters is taken into
account by the statistical estimators when calculating the
confidence of rejecting the null hypothesis. Moreover, the
parameters are physical quantities that are observationally
verifiable (Bhavsar 1989).

Our result that no one model or set of parameters de- 
scribes both data sets is consistent with the fact that a two- 
sample K-S Test indicates that the sets are not consistent 
with one another. Postman & Lauer (1995) have raised ques- 
tions regarding HGT's BCG classification and sky subtrac- 
tion. 



5.3 Physical Motivation 

Researchers have suggested various mechanisms whereby a 
second population with a brighter average metric magnitude 
could evolve from the bright normal galaxies. Cannibalism, 
the process by which large galaxies in the central regions 
of rich clusters grow at the expense of smaller galaxies (Os- 
triker & Hausman 1977; Hausman & Ostriker 1978), is one 
possibility. The existence of giant elliptical and cD galaxies 
near the centre of approximately half of all rich clusters sup- 
ports this hypothesis. These galaxies always lie at the tail- 
end of their cluster-luminosity functions. The occurrence of 
cannibalism continues to be debated (Merritt 1984). 

Motivated by the existence in the literature of strong ar- 
guments for such a process, we build a very simple schematic 
to study its statistical effects on the population of first- 
ranked galaxies. We make two assumptions: (i) at an early 
epoch the BCGs all belonged to one population and (ii) 
galaxies from the bright end of this population evolve, re- 
sulting in a random boost to their luminosity. We construct 
a set of N galaxies with an exponential luminosity function 
between absolute magnitudes -22.0 and -23.0. This repre- 
sents the galaxies at the bright end of cluster luminosity 
functions that are candidates for a boost. A random number,
$n$, of these galaxies undergo a random boost between 0.1
and 0.9 magnitudes. We label the boosted subset as $n_b$. We
choose this range for the following reasons. First, Hausman 
& Ostriker (1978) show via a simulation that one would 
expect a large galaxy to gain, on average, 0.5 magnitudes 
during its first cannibalistic encounter. This is consistent 
with Aragon-Salamanca, Baugh & Kauffmann (1998), who 
state that BCGs were approximately 0.5 magnitudes fainter 
at z = 1. Second, we limit ourselves to one encounter be- 
cause Merritt (1984) argues that the time scale for galactic 
encounters is too long for cannibalism to be common in the 
universe. We wish to investigate the magnitude-distribution 
of the resulting boosted population. These represent the spe- 
cial galaxies mentioned previously. Specifically, this distribu- 
tion could give us insight into the form of f sp . 

To our surprise, we find that the distribution, $f_{sp}$, of $n_b$
is a Gumbel! The K-S Test rejects the Gaussian hypothesis
at the 98% confidence level. Conversely, the Gumbel
distribution, with the same mean and deviation as the data,
fits well, with only a 7% confidence level for rejection. We
summarize these results in Table 4 and Fig. 4. Thus, the
two-population model E (a combination of two Gumbels), which
is best-fitting for the newer LP data, has a physical basis.
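
The schematic can be sketched in a few lines of code. The version below is a guess at one reasonable implementation, not a reproduction of the simulation behind Table 4: the slope of the exponential luminosity function, the uniform distribution of the boosts, the way the boosted subset is chosen, and the sample size and seed are all assumptions, since the text above does not specify them.

```python
import math
import random

def sample_bright_end(n, rng, slope=3.0, faint=-22.0, bright=-23.0):
    """Draw n magnitudes between `faint` and `bright` whose number density
    falls off exponentially toward the bright limit (slope is assumed)."""
    width = faint - bright
    mags = []
    for _ in range(n):
        u = rng.random()
        # Inverse-CDF sampling of a truncated exponential in depth below `faint`.
        depth = -math.log(1.0 - u * (1.0 - math.exp(-slope * width))) / slope
        mags.append(faint - depth)
    return mags

def boost_random_subset(mags, rng, lo=0.1, hi=0.9):
    """Pick a random number of galaxies and brighten each by a random
    amount between lo and hi magnitudes (uniform boosts are assumed)."""
    n_boost = rng.randint(1, len(mags))
    chosen = rng.sample(mags, n_boost)
    return [m - rng.uniform(lo, hi) for m in chosen]  # brighter = more negative

rng = random.Random(2)
candidates = sample_bright_end(5000, rng)
boosted = boost_random_subset(candidates, rng)
mean = sum(boosted) / len(boosted)
std = math.sqrt(sum((m - mean) ** 2 for m in boosted) / len(boosted))
print(len(boosted), mean, std)
```

The resulting list plays the role of the boosted subset, whose magnitude distribution can then be compared with a Gaussian and a Gumbel of the same mean and deviation.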



6 CONCLUSION 

For more than thirty years, cosmologists have debated the 
nature of the magnitude-distribution of brightest cluster 



Table 4. Fit-parameters for the $n_b$ data.

GAUSSIAN FIT    GUMBEL FIT
M_g=-22.64      M_G=-22.64
σ=0.28          a=4.58
D=0.0827        D=0.0296
P=0.978         P=0.067




Figure 4. The end-result of boosting an initial exponential
luminosity function compared with a Gaussian and a Gumbel.
Horizontal axis: M.



galaxies. Peebles (1968) and Sandage (1972, 1976) & Peach 
(1969) reach markedly different conclusions. More recently, 
Bhavsar (1989) and Postman & Lauer (1995) differ regard- 
ing the population(s) that comprise the first-ranked galaxies. 
In light of this controversy, we have conducted a new exam- 
ination of the distribution of BCG magnitudes. We consider 
the BCGs as a class of objects to which we may apply
well-established results from extreme value theory. We find that
there are a number of models that perform well in describ- 
ing the HGT and LP data sets. Though a Gaussian fits both 
data sets, the confidence limits warrant further investigation 
of two-population models. 

Tables 2 & 3 clearly show that we should reject the
Gumbel (model B) as a fit, i.e., the hypothesis that all
BCGs are statistical extremes. The Gaussian (model A)
is marginally acceptable but without physical basis.
Two-population models, in particular, the three combinations of
$f_G$ and $f_g$, describe the data very well. Tables 2 & 3 show
their relative merits. Model E stands out as giving the best
overall fit and is motivated by a physical basis. Therefore,
it is most likely that there are two populations of BCGs:
the extremes of a normal population and a class of atypical
galaxies with a brighter mean.



We thank Marc Postman for sending us the LP data. 
This research was supported by an ANN grant from the US 
Department of Education and the Kentucky Space Grant 
Consortium. 